AITopics

Country:

Europe > Austria (0.28)
Europe > Ireland > Leinster > County Dublin > Dublin (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.64)

Neural Information Processing Systems, Apr-25-2026, 04:56:42 GMT

A.1 Conjugate Derivations Cross-Entropy Loss: L(h,y) = cX

$\sum_{i=1}^{c} y_i = 1$ is satisfied, otherwise $f^*(y) = \infty$ by duality.

A.2 Experiments on Binary Classification with Exponential Loss

Here we present the results on a binary classification task over a synthetic dataset of 100-dimensional Gaussian clusters. For $\Sigma$, similar to [23], we sample a diagonal matrix $D$, where each entry is sampled uniformly from a specified range, and a rotation matrix $U$ from a Haar distribution, giving $\Sigma = U D U^\top$. For the source data, we sample $\mu_{-1}^{s}, \mu_{+1}^{s}, \Sigma_{-1}^{s}, \Sigma_{+1}^{s}$ as specified above with $k = 0$. To create distribution-shifted data of various severity, we sample $\mu_{-1}^{t}, \mu_{+1}^{t}, \Sigma_{-1}^{t}, \Sigma_{+1}^{t}$ as specified above with $k = 1$, which are then used to sample the shifted data as follows:

Exponential Loss for Binary Classification

Let $z$ be the classification score $h_\theta(x)$. For logistic training loss, the conjugate adaptation loss would default to entropy with sigmoid probability.
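The covariance construction described in this excerpt (a diagonal matrix with uniformly sampled entries, conjugated by a Haar-distributed orthogonal matrix) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the dimension, the eigenvalue range, the seed, and the function names are all assumptions, and the exponential loss is written in its standard form exp(-y z) for labels y in {-1, +1}. Gram-Schmidt on Gaussian vectors yields a Haar-distributed orthogonal matrix whose determinant sign is not controlled; that does not affect the covariance, because U D U^T is unchanged when a column of U flips sign.

```python
import math
import random


def haar_orthogonal(dim, rng):
    """Orthonormal vectors obtained by Gram-Schmidt on i.i.d. Gaussian draws."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the vectors already accepted.
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # redraw in the degenerate case
            basis.append([vi / norm for vi in v])
    return basis


def sample_covariance(dim, low, high, rng):
    """Return (Sigma, eigenvalues) with Sigma = U D U^T."""
    eigenvalues = [rng.uniform(low, high) for _ in range(dim)]  # diagonal of D
    u = haar_orthogonal(dim, rng)  # u[j] plays the role of the j-th column of U
    sigma = [[sum(eigenvalues[j] * u[j][a] * u[j][b] for j in range(dim))
              for b in range(dim)] for a in range(dim)]
    return sigma, eigenvalues


def exponential_loss(z, y):
    """Standard exponential loss for a score z and a label y in {-1, +1}."""
    return math.exp(-y * z)


# Hypothetical settings: 5 dimensions, eigenvalues in [0.5, 2.0], fixed seed.
rng = random.Random(0)
sigma, eigenvalues = sample_covariance(5, 0.5, 2.0, rng)
trace = sum(sigma[i][i] for i in range(5))
```

Because U is orthogonal, the trace of the sampled matrix equals the sum of the sampled diagonal entries, which gives a quick sanity check on the construction.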

artificial intelligence, machine learning, sgd, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-10-2026, 11:44:50 GMT

Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport

Consequently, the noisy-to-clean transformation learned from the training data cannot be suitably applied to handle the testing noise, resulting in limited enhancement performance.

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
Asia > Taiwan (0.04)
Asia > Japan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Neural Information Processing Systems, Feb-8-2026, 02:05:41 GMT

2e907f44e0a9616314cf3d964d4e3c93-Paper.pdf

algorithm, cost vector, opponent, (12 more...)

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
Europe > Austria > Vienna (0.14)
Europe > Russia (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.63)

Ding, Ziyi, Zhang, Xiao-Ping

GaussDetect-LiNGAM: Causal Direction Identification without Gaussianity test

arXiv.org Machine Learning, Dec-4-2025

We propose GaussDetect-LiNGAM, a novel approach for bivariate causal discovery that eliminates the need for explicit Gaussianity tests by leveraging a fundamental equivalence between noise Gaussianity and residual independence in the reverse regression. Under the standard LiNGAM assumptions of linearity, acyclicity, and exogeneity, we prove that the Gaussianity of the forward-model noise is equivalent to the independence between the regressor and residual in the reverse model. This theoretical insight allows us to replace fragile and sample-sensitive Gaussianity tests with robust kernel-based independence tests. Experimental results validate the equivalence and demonstrate that GaussDetect-LiNGAM maintains high consistency across diverse noise types and sample sizes, while reducing the number of tests per decision (TPD). Our method enhances both the efficiency and practical applicability of causal inference, making LiNGAM more accessible and reliable in real-world scenarios.
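The link between Gaussianity and residual independence in the reverse regression can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: the cause and the noise are drawn from the same family (both Gaussian or both uniform), the forward slope is 1, and the correlation between squared values stands in as a crude dependence proxy for the kernel-based independence tests mentioned in the abstract. All function names are hypothetical, and a proxy near zero suggests but does not establish independence.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def reverse_regression(x, y):
    """Regress x on y by least squares; return centered regressor and residual."""
    mx, my = mean(x), mean(y)
    yc = [yi - my for yi in y]
    xc = [xi - mx for xi in x]
    slope = sum(a * b for a, b in zip(yc, xc)) / sum(a * a for a in yc)
    residual = [b - slope * a for a, b in zip(yc, xc)]
    return yc, residual


def dependence_proxy(x, y):
    """Correlation of squares between the reverse regressor and its residual."""
    yc, residual = reverse_regression(x, y)
    return pearson([v * v for v in yc], [r * r for r in residual])


def simulate(draw, n):
    """Forward model y = x + e with cause x and noise e drawn by `draw`."""
    x = [draw() for _ in range(n)]
    e = [draw() for _ in range(n)]
    y = [xi + ei for xi, ei in zip(x, e)]
    return x, y


rng = random.Random(0)
half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance

gauss_score = dependence_proxy(*simulate(lambda: rng.gauss(0.0, 1.0), 20000))
uniform_score = dependence_proxy(
    *simulate(lambda: rng.uniform(-half_width, half_width), 20000))
```

In the Gaussian case the reverse regressor and residual are jointly Gaussian and uncorrelated, so the proxy should sit near zero. In the uniform case the two remain uncorrelated but their squares are clearly correlated, which is the kind of dependence an independence test is meant to detect.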

gaussdetect-lingam, gaussianity test, independence test, (15 more...)

arXiv.org Machine Learning

2512.03428

Country:

Asia > Middle East > Jordan (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Alavi, Khashayar, Yeltay, Zhastay, Flek, Lucie, Karimi, Akbar

More Agents Helps but Adversarial Robustness Gap Persists

arXiv.org Artificial Intelligence, Nov-11-2025

When LLM agents work together, they seem to be more powerful than a single LLM in mathematical question answering. However, are they also more robust to adversarial inputs? We investigate this question using adversarially perturbed math questions. These perturbations include punctuation noise with three intensities (10, 30, and 50 percent), plus real-world and human-like typos (WikiTypo, R2ATA). Using a unified sampling-and-voting framework (Agent Forest), we evaluate six open-source models (Qwen3-4B/14B, Llama3.1-8B, Mistral-7B, Gemma3-4B/12B) across four benchmarks (GSM8K, MATH, MMLU-Math, MultiArith), with the number of agents n ranging from one to 25 (1, 2, 5, 10, 15, 20, 25). Our findings show that (1) noise type matters: the harm from punctuation noise scales with its severity, and human typos remain the dominant bottleneck, yielding the largest gaps to Clean accuracy and the highest attack success rate (ASR) even with a large number of agents; and (2) collaboration reliably improves accuracy as the number of agents, n, increases, with the largest gains from one to five agents and diminishing returns beyond 10 agents. However, the adversarial robustness gap persists regardless of the agent count.
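The sampling-and-voting idea can be sketched as follows: draw n independent answers to the same question and return the most frequent one. This is a generic majority-vote sketch, not the Agent Forest implementation; `sample_answer` is a hypothetical callable standing in for one model call, and the canned answers exist only to make the example deterministic.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def sample_and_vote(sample_answer, question, n_agents):
    """Query the (hypothetical) sampler n_agents times and vote on the result."""
    answers = [sample_answer(question) for _ in range(n_agents)]
    return majority_vote(answers)


# Stub sampler: replays canned answers instead of calling a model.
canned = iter(["18", "17", "18", "20", "18"])
result = sample_and_vote(lambda question: next(canned), "toy question", 5)
```

With these canned answers the vote returns "18", which three of the five samples agree on.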

large language model, machine learning, natural language, (20 more...)

2511.07112

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Dimidov, Valeriu, Hawlader, Faisal, Jafarnejad, Sasan, Frank, Raphaël

Cleaning Maintenance Logs with LLM Agents for Improved Predictive Maintenance

arXiv.org Artificial Intelligence, Nov-10-2025

Economic constraints, limited availability of datasets for reproducibility and shortages of specialized expertise have long been recognized as key challenges to the adoption and advancement of predictive maintenance (PdM) in the automotive sector. Recent progress in large language models (LLMs) presents an opportunity to overcome these barriers and speed up the transition of PdM from research to industrial practice. Under these conditions, we explore the potential of LLM-based agents to support PdM cleaning pipelines. Specifically, we focus on maintenance logs, a critical data source for training well-performing machine learning (ML) models, but one often affected by errors such as typos, missing fields, near-duplicate entries, and incorrect dates. We evaluate LLM agents on cleaning tasks involving six distinct types of noise. Our findings show that LLMs are effective at handling generic cleaning tasks and offer a promising foundation for future industrial applications. While domain-specific errors remain challenging, these results highlight the potential for further improvements through specialized training and enhanced agentic capabilities.

artificial intelligence, large language model, natural language, (18 more...)

2511.05311

Genre: Research Report > New Finding (1.00)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial Intelligence, Oct-30-2025

Explainable Disentanglement on Discrete Speech Representations for Noise-Robust ASR

Gopal, Shreyas, Anshul, Ashutosh, Li, Haoyang, Yeo, Yue Heng, Liu, Hexin, Chng, Eng Siong

Discrete audio representations are gaining traction in speech modeling due to their interpretability and compatibility with large language models, but are not always optimized for noisy or real-world environments. Building on existing works that quantize Whisper embeddings for speech-to-unit modeling, we propose disentangling semantic speech content from background noise in the latent space. Our end-to-end model separates clean speech in the form of codebook tokens, while extracting interpretable noise vectors as quantization residue, which are supervised via a lightweight classifier. We show that our approach improves alignment between clean/noisy speech and text, producing speech tokens that display a high degree of noise invariance, and improves ASR performance. Keeping Whisper frozen, we show an 82% reduction in error rate compared to Whisper, and 35% improvement over baseline methods on the VBDemand test set. Further analyses show that the learned token space generalizes well to both seen and unseen acoustic conditions.
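The split into a codebook token plus a quantization residue can be illustrated with a toy nearest-neighbour quantizer. This sketch assumes squared Euclidean distance and a hand-written two-entry codebook in two dimensions; it does not reproduce the authors' model, Whisper embeddings, or the classifier that supervises the residue.

```python
def quantize(embedding, codebook):
    """Return (token index, residual) for the nearest codeword."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)),
                key=lambda k: squared_distance(embedding, codebook[k]))
    # The residual is what the chosen codeword does not explain.
    residual = [e - c for e, c in zip(embedding, codebook[index])]
    return index, residual


# Hypothetical 2-D codebook and embeddings, for illustration only.
codebook = [[1.0, 0.0], [0.0, 1.0]]
clean_token, clean_residual = quantize([1.0, 0.0], codebook)
noisy_token, noisy_residual = quantize([0.75, 0.25], codebook)
```

Here the clean and the perturbed embedding map to the same token (index 0), while the perturbation shows up only in the residual, `[-0.25, 0.25]`.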

artificial intelligence, machine learning, natural language, (14 more...)

2510.2515

Country: Asia (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Janßen, Nick, Schaller, Melanie, Rosenhahn, Bodo

Benchmarking M-LTSF: Frequency and Noise-Based Evaluation of Multivariate Long Time Series Forecasting Models

arXiv.org Artificial Intelligence, Oct-7-2025

Understanding the robustness of deep learning models for multivariate long-term time series forecasting (M-LTSF) remains challenging, as evaluations typically rely on real-world datasets with unknown noise properties. We propose a simulation-based evaluation framework that generates parameterizable synthetic datasets, where each dataset instance corresponds to a different configuration of signal components, noise types, signal-to-noise ratios, and frequency characteristics. These configurable components aim to model real-world multivariate time series data without the ambiguity of unknown noise. This framework enables fine-grained, systematic evaluation of M-LTSF models under controlled and diverse scenarios. Our analysis reveals that all models degrade severely when lookback windows cannot capture complete periods of seasonal patterns in the data. S-Mamba and Autoformer perform best on sawtooth patterns, while R-Linear and iTransformer favor sinusoidal signals. White and Brownian noise universally degrade performance with lower signal-to-noise ratio, while S-Mamba shows specific trend-noise vulnerability and iTransformer shows seasonal-noise vulnerability. Further spectral analysis shows that S-Mamba and iTransformer achieve superior frequency reconstruction. This controlled approach, based on our synthetic and principle-driven testbed, offers deeper insights into model-specific strengths and limitations through the aggregation of MSE scores and provides concrete guidance for model selection based on signal characteristics and noise conditions.

Time series forecasting plays a crucial role across diverse fields such as energy systems [1]-[3], meteorology [4], [5], traffic flow modeling [6], [7] or the modeling of sensor networks [8], [9]. Reliable forecasts support proactive decision-making, effective risk management, and efficient planning. As high-resolution temporal data becomes increasingly available, the need for robust and scalable forecasting models has grown more important than ever. A time series represents data points ordered in time and can be categorized as either univariate, when consisting of a single variable, or multivariate, when involving multiple interdependent variables [10].
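A synthetic series with a chosen signal shape, noise type, and signal-to-noise ratio, as the abstract describes, can be sketched in a few lines. This is a minimal single-channel illustration rather than the authors' framework: the function names, the series length, the period, and the definition of power as the mean square of the series are all assumptions.

```python
import math
import random


def sinusoid(n, period):
    return [math.sin(2.0 * math.pi * t / period) for t in range(n)]


def sawtooth(n, period):
    """Ramp from -1 up towards +1 once per period."""
    return [2.0 * ((t % period) / period) - 1.0 for t in range(n)]


def white_noise(n, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def brownian_noise(n, rng):
    """Cumulative sum of white noise."""
    total, path = 0.0, []
    for step in white_noise(n, rng):
        total += step
        path.append(total)
    return path


def power(series):
    """Mean square of the series (assumed definition of power)."""
    return sum(v * v for v in series) / len(series)


def add_noise_at_snr(signal, noise, snr_db):
    """Rescale the noise so that power(signal) / power(noise) hits snr_db."""
    target_noise_power = power(signal) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / power(noise))
    return [s + scale * e for s, e in zip(signal, noise)]


def measured_snr_db(signal, noisy):
    noise = [y - s for s, y in zip(signal, noisy)]
    return 10.0 * math.log10(power(signal) / power(noise))


# Hypothetical configuration: sawtooth with period 64, white noise at 10 dB.
rng = random.Random(0)
signal = sawtooth(512, 64)
noisy = add_noise_at_snr(signal, white_noise(512, rng), 10.0)
snr = measured_snr_db(signal, noisy)
```

Because the noise is rescaled from its empirical power, the measured ratio matches the 10 dB target up to floating-point error. In this setup a lookback window shorter than the 64-step period cannot contain a complete cycle, which is the condition the abstract associates with severe degradation.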

data mining, machine learning, natural language, (19 more...)

2510.049

Genre: Research Report > New Finding (0.93)

Industry: Energy (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Neural Information Processing Systems, Sep-30-2025, 13:01:14 GMT

Adaptive Multi-Column Deep Neural Networks with Application to Robust Image Denoising

Stacked sparse denoising auto-encoders (SSDAs) have recently been shown to be successful at removing noise from corrupted images. However, like most denoising techniques, the SSDA is not robust to variation in noise types beyond what it has seen during training. We present the multi-column stacked sparse denoising autoencoder (MC-SSDA), a novel technique that combines the outputs of multiple SSDAs. We eliminate the need to determine the type of noise, let alone its statistics, at test time. We show that good denoising performance can be achieved with a single system on a variety of different noise types, including ones not seen in the training set. Additionally, we experimentally demonstrate the efficacy of MC-SSDA denoising by achieving MNIST digit error rates on denoised images close to those on the uncorrupted images.

adaptive multi-column deep neural network, application, robust image denoising, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)